Add /v1/project/batchDelete API method that deletes with SQL #4383
base: master
Conversation
Signed-off-by: Mikael Carneholm <[email protected]>
```java
)
@ApiResponses(value = {
    @ApiResponse(responseCode = "204", description = "Projects removed successfully"),
    @ApiResponse(responseCode = "207", description = "Access is forbidden to the projects listed in the response"),
```
Not sold on this status code. Some surface-level research leads me to believe using this outside of WebDAV is not intended: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/207
Wouldn't it be better to atomically fail the request if deletion of one or more projects failed?
Fair point, 207 does indeed seem to be reserved for WebDAV. The reason for implementing it this way was the outcome of an earlier discussion, where sebD suggested returning a 204 with a body containing the list of inaccessible projects. The 207 was my suggestion, and since I got a thumbs up I thought I would do it the same way in this PR. I see value in letting the client know which projects are inaccessible, so I'm leaning towards going with sebD's suggestion instead of just returning a 401. Would that be an acceptable solution?
I think sebD's original comment was geared towards the situation where we proceed with deleting accessible projects when there have been inaccessible projects in the provided list. In that case you do need to communicate the partial success somehow.
My point is that no projects should be deleted at all if at least one of them is inaccessible. You can then still respond with 401, and you could use ProblemDetails to communicate the inaccessible projects in a machine-readable way.
We do this for tag endpoints already:
- Persistence operations can throw a TagOperationFailedException, e.g. dependency-track/src/main/java/org/dependencytrack/persistence/TagQueryManager.java, lines 306 to 318 in 4cb2f3a:

  ```java
  final var errorByTagName = new HashMap<String, String>();
  if (tagNames.size() > candidateRows.size()) {
      final Set<String> candidateRowNames = candidateRows.stream()
              .map(TagDeletionCandidateRow::name)
              .collect(Collectors.toSet());
      for (final String tagName : tagNames) {
          if (!candidateRowNames.contains(tagName)) {
              errorByTagName.put(tagName, "Tag does not exist");
          }
      }
      throw TagOperationFailedException.forDeletion(errorByTagName);
  }
  ```

  TagOperationFailedExceptionMapper maps the exception to a TagOperationFailedProblemDetails object that is then returned by the API.
- The client then sees this (dependency-track/src/test/java/org/dependencytrack/resources/v1/TagResourceTest.java, lines 293 to 300 in 4cb2f3a):

  ```json
  {
    "status": 400,
    "title": "Tag operation failed",
    "detail": "The tag(s) foo could not be deleted",
    "errors": {
      "foo": "Tag does not exist"
    }
  }
  ```
Ok, I'll look into using ProblemDetails since that will make it consistent with how failed tag deletion is handled. Thanks for the tip!
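Adapting the tag pattern to this endpoint, the error body might look something like this (the title, detail wording, and use of UUIDs as error keys are illustrative assumptions, not part of the PR):

```json
{
  "status": 403,
  "title": "Project deletion failed",
  "detail": "One or more projects could not be deleted",
  "errors": {
    "7c0e9c2e-0000-0000-0000-000000000000": "Access to this project is forbidden"
  }
}
```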
```java
    @ApiResponse(responseCode = "401", description = "Unauthorized")
})
@PermissionRequired(Permissions.Constants.PORTFOLIO_MANAGEMENT)
public Response deleteProjects(List<UUID> uuids) {
```
Consider using @Size(min = 1, max = ...) here. The limits should then also appear in the OpenAPI spec automatically.
I think there should be some upper limit to the number of UUIDs being submitted in one go. We can always extend it later if people need it, but we can't reduce it later if larger batches end up causing problems.
That's a good idea. I have tested it with up to 1000 UUIDs, so maybe that could be a reasonable max size to start with?
Yeah that sounds reasonable.
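A minimal sketch of what the validated signature could look like, assuming Jakarta Bean Validation is available on the resource (the max of 1000 reflects the number agreed above; the message text is a placeholder):

```java
@PermissionRequired(Permissions.Constants.PORTFOLIO_MANAGEMENT)
public Response deleteProjects(
        @Size(min = 1, max = 1000, message = "The batch may contain between 1 and 1000 project UUIDs")
        final List<UUID> uuids) {
    // ... deletion logic unchanged ...
}
```

With the annotation in place, oversized or empty batches should be rejected by the validation layer before any deletion logic runs, and the bounds surface in the generated OpenAPI spec.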
```java
try (QueryManager qm = new QueryManager()) {
    for (Iterator<UUID> it = uuids.iterator(); it.hasNext();) {
        UUID uuid = it.next();
        final Project project = qm.getObjectByUuid(Project.class, uuid, Project.FetchGroup.ALL.name());
```
What's the reason for using FetchGroup.ALL here?
Indeed, that doesn't seem to be needed since Project.accessTeams is part of the default fetch group. ProjectQueryManager.hasAccess only needs that property from the project, so I'll change to the getObjectByUuid(Class, UUID) signature instead.
```java
    """);
executeAndCloseWithArray(sqlQuery, (Object) uuidsArray);

// The below has only been tested with Postgres, but should work on any RDBMS supporting SQL:1999
```
For completeness, how many RDBMSes have you tested with this? Having tested at least H2, MSSQL, and PostgreSQL would be good. We have a Docker Compose file for MSSQL here: https://github.com/DependencyTrack/dependency-track/blob/master/dev/docker-compose.mssql.yml
I've only tested it with H2 and PostgreSQL so far (H2 in unit tests, PostgreSQL in system tests) but I can try migrating my test data to MSSQL and test that as well. I'll get back to you with the results when done!
No need to do a full load test or anything. We just need assurance that the queries work as expected. Sadly the small differences in SQL syntax between the different RDBMSes turn out to be tripwires more often than not. :D
True, true, and true :) But I'll need some data in the schema to pass the condition that the UUID list is not empty, so I'll try to migrate the relatively small exported sample that I created with pg_sample.
Description
This patch adds a /v1/project/batchDelete method that makes project life-cycle management more efficient, since multiple projects can be deleted per request.
Addressed Issue
This fixes #3361
Additional Details
This is a modified version of #3407 that uses pure SQL to do the deletion instead. By doing so, it's much faster:
A caveat is that it uses SQL Common Table Expressions (CTEs), which have been part of the ANSI standard since SQL:1999 but are still not supported by all DB vendors. The H2 database used for the unit tests has experimental support for CTEs, but does not support CTE DELETE statements. For this reason the code checks the name of the JDBC driver and only runs certain statements when the driver is org.postgresql.Driver.
Benchmarks using a local sample DB populated with ~11000 projects imported from a production system:
NB: The effect of the batch size is limited by the checkpoint configuration (see https://www.enterprisedb.com/blog/basics-tuning-checkpoints), but in this test the container DB was running with a default configuration.
Checklist
- [ ] This PR fixes a defect, and I have provided tests to verify that the fix is effective
- [ ] This PR introduces changes to the database model, and I have added corresponding update logic
- [ ] This PR introduces new or alters existing behavior, and I have updated the documentation accordingly